
Comparison of the Predictive Accuracy of DNA Array-Based Multigene Classifiers across cDNA Arrays and Affymetrix GeneChips



Abstract

We examined how well differentially expressed genes and multigene outcome classifiers retain their class-discriminating value when tested on data generated by different transcriptional profiling platforms. RNA from 33 stage I-III breast cancers was hybridized to both Affymetrix GeneChip and Millennium Pharmaceuticals cDNA arrays. When UniGene was used to match probes, only 30% of all corresponding gene expression measurements on the two platforms had a Pearson correlation coefficient r ≥ 0.7. Correlation also varied substantially between different Affymetrix probe sets matched to the same cDNA probe. When cDNA and Affymetrix probes were instead matched by Basic Local Alignment Search Tool (BLAST) sequence identity, the correlation increased substantially. We identified 182 genes in the Affymetrix data and 45 in the cDNA data (including 17 genes common to both) that accurately separated 91% of cases in supervised hierarchical clustering in each data set. Cross-platform testing of these informative genes resulted in lower clustering accuracies of 45% and 79%, respectively. Several sets of accurate five-gene classifiers were developed on each platform using linear discriminant analysis. The best 100 classifiers showed an average misclassification error rate of 2% on the original data, which rose to 19.5% when they were tested on data from the other platform. By comparison, random five-gene classifiers showed a misclassification error rate of 33%. We conclude that multigene predictors optimized for one platform lose accuracy when applied to data from another platform because of missing genes and sequence differences in probes that result in differing measurements for the same gene.
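The cross-platform concordance figure reported in the abstract (the share of matched measurements with r ≥ 0.7) can be illustrated with a minimal sketch. This is not the authors' code: the function names `pearson_r` and `fraction_concordant`, the gene labels, and the expression values are all hypothetical, and the sketch assumes probes have already been matched to a shared gene identifier and that each gene has one value per sample, in the same sample order, on both platforms.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant profile
    return cov / math.sqrt(var_x * var_y)


def fraction_concordant(platform_a, platform_b, threshold=0.7):
    """Return (fraction of shared genes with r >= threshold, per-gene r values).

    Genes present on only one platform are ignored here; the abstract notes
    that such missing genes are one reason classifiers transfer poorly.
    """
    shared = sorted(set(platform_a) & set(platform_b))
    if not shared:
        raise ValueError("no genes are shared between the two platforms")
    rs = {gene: pearson_r(platform_a[gene], platform_b[gene]) for gene in shared}
    n_concordant = sum(1 for r in rs.values() if r >= threshold)
    return n_concordant / len(shared), rs


# Made-up toy data: four samples per gene, purely for illustration.
affy = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [1.0, 2.0, 3.0, 4.0],
    "GENE_C": [5.0, 1.0, 4.0, 2.0],  # measured only on this platform
}
cdna = {
    "GENE_A": [2.1, 3.9, 6.2, 8.0],  # tracks the other platform closely
    "GENE_B": [4.0, 3.0, 2.0, 1.0],  # perfectly anti-correlated
    "GENE_D": [0.5, 0.7, 0.6, 0.9],  # measured only on this platform
}

fraction, per_gene_r = fraction_concordant(affy, cdna)
print(f"{fraction:.0%} of shared genes have r >= 0.7")
```

In this toy case only GENE_A and GENE_B are shared, and only GENE_A passes the threshold. The same calculation applied to probe pairs matched through UniGene versus BLAST sequence identity would give the kind of comparison the abstract describes, although the study's actual matching and preprocessing steps are not specified here.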
